Introduction

Column

Tempo Distribution differences

Energy vs Valence

Column {data-width= 400}

Information about the corpus and some graphs

Corpus information and motivation

The corpora I chose are playlists from Spotify. These are playlists I really often listen to when I am studying or travelling for example. It consists of a lot of different kind of music and everytime it has something I really enjoy listening to. The playlists are the Spotify playlist called “ALL OUT OF 2010s” and “ALL OUT OF 2000s”. I think there is alot of different songs/artists/genres etc. There are a few artists who have released very big and many songs and appear more often in this list so that will also lead to interesting results. The artists are mainly Adele, Justin Bieber, Shawn Mendes and Eminem.

What I am eager about is, is to find new correlations or insights that you can not see or hear when you are listening to the playlist. The 2010s saw the emergence of new genres, such as trap and EDM, as well as the continued popularity of established genres like pop, rock, and hip-hop. This diversity of musical styles offers a rich and varied corpus for analysis and exploration.

Outliers

To find any correlations or other insights into these corpora, we focus on the two longest songs in both playlists. This is because this has the greatest chance for the difference in notes or other changes over the years.

Consequently, the two songs discussed in this dashboard are “Mirros by Justin Timberlake” from the 2000s and “Stan by Eminem” from the 2010s. I use these songs for all my plots and visualizations and share my findings about them.

Graphs

But first, we briefly compare the two corpora in the two plots here on the left. We see that, in general, the songs from 2010s have a faster Tempo. This is also expected with the rise of house music and EDM. What is very noticeable is that the energy of the 2010 playlist is on average lower than that of the songs from the 2000 playlist.

Chromagrams

Column

Mirrors by Justin Timberlake

Stan by Eminem

Column {data-width= 400}

Text about the chromagrams

In the chromagrams we see clearly which notes occur most in the two songs, what is immediately noticeable is that there is a big difference between the notes of the two songs, the song of Eminem uses mainly the low notes which is more expected in rap and the song Mirrors uses mainly the higher (happier) notes. Mirrors also uses more minor notes than Stan.

Ceptograms

Column

Mirrors by Justin Timberlake

Stan by Eminem

Column {data-width= 400}

Text about the ceptograms

As can be seen, there is not really a very big and obvious difference between the two plots. What is noticeable is that c02 and c03 is very high in the beginning with Stan and in the end which makes sense because that has more pace than the middle section and it also ends with more pace again. With Mirrors, c01 through c03 is quite present, which is quite expected because the song has a very similar tempo and mostly the same notes as well.

Self-Similarity Matrices

Column

Mirrors by Justin Timberlake

Stan by Eminem

Column {data-width= 400}

Text about the SSM

We can see with both Mirrors and Stan exactly when almost most of the instruments are used at once. Which is an interesting occurrence because with Stan you hear this quite clearly at the beginning but with Mirrors it is more difficult to figure out exactly when this happens. This is probably because the song uses more instruments than Stan anyway.

Chordograms

Column

Mirrors by Justin Timberlake

Stan by Eminem

Column {data-width= 400}

Text about the chordograms

There is an obvious difference in these plots. And also confirms expectations. Mirrors is using a lot more notes (keys) than Stan. And there is a moment when everything comes at once in Stan which is quite obvious in the plot (around 300 s). This also shows a difference between rap and more cheerful song with higher danceability. What also leads from Stan’s plot is that the song is mostly in Major notes, which was also confirmed earlier.

Tempograms/Dendrograms

Column

Dendrogram clustering

Mirrors by Justin Timberlake

Stan by Eminem

Column {data-width= 400}

Text about the dendrogram and tempograms

Here we have the hierarchical cluster of the 2010 corpus. I used the “complete” method in the code otherwise the dendrogram was much more inefficient (it isn’t best now too but I can not seem to fix that..).

I also have a problem with the tempograms of the songs, because if I load the code separately the code works fine, but when I knit the project it keeps in a loop and eventually stops knitting… I really don’t know how to fix that. I do have the code in the comments in de .Rmd file.

Conclusion and Thoughts

What we can conclude from all the plots and the dashboard is that while there are differences between the two playlists and years, there is too little time difference to see these differences properly and say certain conclusions. However, there are clear differences between the two tracks I used for this dashboard.

Perhaps for future work, clear differences can also be seen between the shortest two songs of the two playlists. In addition, it might also be useful to take the “all out of 80s” playlist instead of the 2000s playlist. There are many more years between the 80s and the 2010s, and thus the differences will probably be easier to see. This is an interesting possibility for a follow-up study.

---
title: "Computational Musicology Dashboard 2023"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
---


```{r setup, include=FALSE}
library(flexdashboard)
library(plotly)
library(tidymodels)
library(tibble)
library(ggdendro)
library(stringr)
library(purrr)
library(scales)
library(tidyr)
library(remotes)
library(spotifyr)
library(ggplot2)
library(dplyr)
library(compmus)

all_out_2010 <- get_playlist_audio_features("", "37i9dQZF1DX5Ejj0EkURtP")
all_out_2000 <- get_playlist_audio_features("", "37i9dQZF1DX4o1oenSJRJd")
```

Introduction 
================

Column {.tabset}
--------------------------------

### Tempo Distribution differences
```{r}
plot <- ggplot(height=400, width=500) +
  geom_density(data = all_out_2010, aes(x=tempo, fill="All Out 2010s"), alpha=0.5) +
  geom_density(data = all_out_2000, aes(x=tempo, fill="All Out 2000s"), alpha=0.5) +
  labs(title="Tempo Distribution", x="Tempo", y="Density") +
  scale_fill_manual(values=c("blue", "green"))

# Convert the ggplot object to a plotly object
plot <- ggplotly(plot)
plot
```

### Energy vs Valence
```{r}
ggplotly(
  ggplot() +
    geom_point(data = all_out_2010, aes(x=energy, y=valence, color="All Out 2010"), alpha=0.5) +
    geom_point(data = all_out_2000, aes(x=energy, y=valence, color="All Out 2000"), alpha=0.5) +
    labs(title="Energy vs. Valence", x="Energy", y="Valence") +
    scale_color_manual(values=c("blue", "green"))
)
```

Column {data-width= 400}
----------------------

### Information about the corpus and some graphs

<h2>Corpus information and motivation</h2>
The corpora I chose are playlists from Spotify. These are playlists I really often listen to when I am studying or travelling for example. It consists of a lot of different kind of music and everytime it has something I really enjoy listening to. The playlists are the Spotify playlist called "ALL OUT OF 2010s" and "ALL OUT OF 2000s". I think there is alot of different songs/artists/genres etc. There are a few artists who have released very big and many songs and appear more often in this list so that will also lead to interesting results. The artists are mainly Adele, Justin Bieber, Shawn Mendes and Eminem. 

What I am eager about is, is to find new correlations or insights that you can not see or hear when you are listening to the playlist. The 2010s saw the emergence of new genres, such as trap and EDM, as well as the continued popularity of established genres like pop, rock, and hip-hop. This diversity of musical styles offers a rich and varied corpus for analysis and exploration.

<h3>Outliers</h3>
To find any correlations or other insights into these corpora, we focus on the two longest songs in both playlists. This is because this has the greatest chance for the difference in notes or other changes over the years.

Consequently, the two songs discussed in this dashboard are "Mirros by Justin Timberlake" from the 2000s and "Stan by Eminem" from the 2010s. I use these songs for all my plots and visualizations and share my findings about them.

<h3>Graphs</h3>
But first, we briefly compare the two corpora in the two plots here on the left. We see that, in general, the songs from 2010s have a faster Tempo. This is also expected with the rise of house music and EDM. What is very noticeable is that the energy of the 2010 playlist is on average lower than that of the songs from the 2000 playlist.

Chromagrams
============================================

Column {.tabset}
--------------------------------

### Mirrors by Justin Timberlake

```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |>
  select(segments) |>
  unnest(segments) |>
  select(start, duration, pitches)

mirrors |>
  mutate(pitches = map(pitches, compmus_normalise, "euclidean")) |>
  compmus_gather_chroma() |> 
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = pitch_class,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  theme_minimal() +
  scale_fill_viridis_c()
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |>
  select(segments) |>
  unnest(segments) |>
  select(start, duration, pitches)

stan |>
  mutate(pitches = map(pitches, compmus_normalise, "euclidean")) |>
  compmus_gather_chroma() |> 
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = pitch_class,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  theme_minimal() +
  scale_fill_viridis_c()
```

Column {data-width= 400}
----------------------

### Text about the chromagrams
In the chromagrams we see clearly which notes occur most in the two songs, what is immediately noticeable is that there is a big difference between the notes of the two songs, the song of Eminem uses mainly the low notes which is more expected in rap and the song Mirrors uses mainly the higher (happier) notes. Mirrors also uses more minor notes than Stan.


Ceptograms
============================================

Column {.tabset}
--------------------------------

### Mirrors by Justin Timberlake
```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |> # Change URI.
  compmus_align(bars, segments) |>                     # Change `bars`
  select(bars) |>                                      #   in all three
  unnest(bars) |>                                      #   of these lines.
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  )

mirrors |>
  compmus_gather_timbre() |>
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = basis,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  scale_fill_viridis_c() +                              
  theme_classic()
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |> # Change URI.
  compmus_align(bars, segments) |>                     # Change `bars`
  select(bars) |>                                      #   in all three
  unnest(bars) |>                                      #   of these lines.
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  )

stan |>
  compmus_gather_timbre() |>
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = basis,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  scale_fill_viridis_c() +                              
  theme_classic()
```

Column {data-width= 400}
----------------------

### Text about the ceptograms
As can be seen, there is not really a very big and obvious difference between the two plots. What is noticeable is that c02 and c03 is very high in the beginning with Stan and in the end which makes sense because that has more pace than the middle section and it also ends with more pace again. With Mirrors, c01 through c03 is quite present, which is quite expected because the song has a very similar tempo and mostly the same notes as well.



Self-Similarity Matrices
==============================

Column {.tabset}
--------------------------------

### Mirrors by Justin Timberlake
```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |> # Change URI.
  compmus_align(bars, segments) |>                     # Change `bars`
  select(bars) |>                                      #   in all three
  unnest(bars) |>                                      #   of these lines.
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  )

mirrors |>
  compmus_self_similarity(timbre, "cosine") |> 
  ggplot(
    aes(
      x = xstart + xduration / 2,
      width = xduration,
      y = ystart + yduration / 2,
      height = yduration,
      fill = d
    )
  ) +
  geom_tile() +
  coord_fixed() +
  scale_fill_viridis_c(guide = "none") +
  theme_classic() +
  labs(x = "", y = "")
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |> # Change URI.
  compmus_align(bars, segments) |>                     # Change `bars`
  select(bars) |>                                      #   in all three
  unnest(bars) |>                                      #   of these lines.
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # Change summary & norm.
      )
  )

stan |>
  compmus_self_similarity(timbre, "cosine") |> 
  ggplot(
    aes(
      x = xstart + xduration / 2,
      width = xduration,
      y = ystart + yduration / 2,
      height = yduration,
      fill = d
    )
  ) +
  geom_tile() +
  coord_fixed() +
  scale_fill_viridis_c(guide = "none") +
  theme_classic() +
  labs(x = "", y = "")
```

Column {data-width= 400}
----------------------

### Text about the SSM
We can see with both Mirrors and Stan exactly when almost most of the instruments are used at once. Which is an interesting occurrence because with Stan you hear this quite clearly at the beginning but with Mirrors it is more difficult to figure out exactly when this happens. This is probably because the song uses more instruments than Stan anyway.

Chordograms
=====================================

Column {.tabset}
--------------------------------

```{r}
circshift <- function(v, n) {
  if (n == 0) v else c(tail(v, n), head(v, -n))
}

#      C     C#    D     Eb    E     F     F#    G     Ab    A     Bb    B
major_chord <-
  c(   1,    0,    0,    0,    1,    0,    0,    1,    0,    0,    0,    0)
minor_chord <-
  c(   1,    0,    0,    1,    0,    0,    0,    1,    0,    0,    0,    0)
seventh_chord <-
  c(   1,    0,    0,    0,    1,    0,    0,    1,    0,    0,    1,    0)

major_key <-
  c(6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88)
minor_key <-
  c(6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17)

chord_templates <-
  tribble(
    ~name, ~template,
    "Gb:7", circshift(seventh_chord, 6),
    "Gb:maj", circshift(major_chord, 6),
    "Bb:min", circshift(minor_chord, 10),
    "Db:maj", circshift(major_chord, 1),
    "F:min", circshift(minor_chord, 5),
    "Ab:7", circshift(seventh_chord, 8),
    "Ab:maj", circshift(major_chord, 8),
    "C:min", circshift(minor_chord, 0),
    "Eb:7", circshift(seventh_chord, 3),
    "Eb:maj", circshift(major_chord, 3),
    "G:min", circshift(minor_chord, 7),
    "Bb:7", circshift(seventh_chord, 10),
    "Bb:maj", circshift(major_chord, 10),
    "D:min", circshift(minor_chord, 2),
    "F:7", circshift(seventh_chord, 5),
    "F:maj", circshift(major_chord, 5),
    "A:min", circshift(minor_chord, 9),
    "C:7", circshift(seventh_chord, 0),
    "C:maj", circshift(major_chord, 0),
    "E:min", circshift(minor_chord, 4),
    "G:7", circshift(seventh_chord, 7),
    "G:maj", circshift(major_chord, 7),
    "B:min", circshift(minor_chord, 11),
    "D:7", circshift(seventh_chord, 2),
    "D:maj", circshift(major_chord, 2),
    "F#:min", circshift(minor_chord, 6),
    "A:7", circshift(seventh_chord, 9),
    "A:maj", circshift(major_chord, 9),
    "C#:min", circshift(minor_chord, 1),
    "E:7", circshift(seventh_chord, 4),
    "E:maj", circshift(major_chord, 4),
    "G#:min", circshift(minor_chord, 8),
    "B:7", circshift(seventh_chord, 11),
    "B:maj", circshift(major_chord, 11),
    "D#:min", circshift(minor_chord, 3)
  )

key_templates <-
  tribble(
    ~name, ~template,
    "Gb:maj", circshift(major_key, 6),
    "Bb:min", circshift(minor_key, 10),
    "Db:maj", circshift(major_key, 1),
    "F:min", circshift(minor_key, 5),
    "Ab:maj", circshift(major_key, 8),
    "C:min", circshift(minor_key, 0),
    "Eb:maj", circshift(major_key, 3),
    "G:min", circshift(minor_key, 7),
    "Bb:maj", circshift(major_key, 10),
    "D:min", circshift(minor_key, 2),
    "F:maj", circshift(major_key, 5),
    "A:min", circshift(minor_key, 9),
    "C:maj", circshift(major_key, 0),
    "E:min", circshift(minor_key, 4),
    "G:maj", circshift(major_key, 7),
    "B:min", circshift(minor_key, 11),
    "D:maj", circshift(major_key, 2),
    "F#:min", circshift(minor_key, 6),
    "A:maj", circshift(major_key, 9),
    "C#:min", circshift(minor_key, 1),
    "E:maj", circshift(major_key, 4),
    "G#:min", circshift(minor_key, 8),
    "B:maj", circshift(major_key, 11),
    "D#:min", circshift(minor_key, 3)
  )
```

### Mirrors by Justin Timberlake
```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |>
  compmus_align(sections, segments) |>
  select(sections) |>
  unnest(sections) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "mean", norm = "manhattan"
      )
  )

mirrors |> 
  compmus_match_pitch_template(
    key_templates,         # Change to chord_templates if descired
    method = "euclidean",  # Try different distance metrics
    norm = "manhattan"     # Try different norms
  ) |>
  ggplot(
    aes(x = start + duration / 2, width = duration, y = name, fill = d)
  ) +
  geom_tile() +
  scale_fill_viridis_c(guide = "none") +
  theme_minimal() +
  labs(x = "Time (s)", y = "")
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |>
  compmus_align(sections, segments) |>
  select(sections) |>
  unnest(sections) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "mean", norm = "manhattan"
      )
  )

stan |> 
  compmus_match_pitch_template(
    key_templates,         # Change to chord_templates if descired
    method = "euclidean",  # Try different distance metrics
    norm = "manhattan"     # Try different norms
  ) |>
  ggplot(
    aes(x = start + duration / 2, width = duration, y = name, fill = d)
  ) +
  geom_tile() +
  scale_fill_viridis_c(guide = "none") +
  theme_minimal() +
  labs(x = "Time (s)", y = "")
```

Column {data-width= 400}
----------------------

### Text about the chordograms
There is an obvious difference in these plots. And also confirms expectations. Mirrors is using a lot more notes (keys) than Stan. And there is a moment when everything comes at once in Stan which is quite obvious in the plot (around 300 s). This also shows a difference between rap and more cheerful song with higher danceability. What also leads from Stan's plot is that the song is mostly in Major notes, which was also confirmed earlier.

Tempograms/Dendrograms
==================================

Column {.tabset}
--------------------------------

### Dendrogram clustering
```{r}
out2010 <-
  get_playlist_audio_features("", "37i9dQZF1DX5Ejj0EkURtP") |>
  add_audio_analysis() |>
  mutate(
	segments = map2(segments, key, compmus_c_transpose),
	pitches =
  	map(segments,
    	compmus_summarise, pitches,
    	method = "mean", norm = "manhattan"
  	),
	timbre =
  	map(
    	segments,
    	compmus_summarise, timbre,
    	method = "mean"
  	)
  ) |>
  mutate(pitches = map(pitches, compmus_normalise, "clr")) |>
  mutate_at(vars(pitches, timbre), map, bind_rows) |>
  unnest(cols = c(pitches, timbre))

out2010_juice <-
  recipe(
	track.name ~
  	danceability +
  	energy +
  	loudness +
  	speechiness +
  	acousticness +
  	instrumentalness +
  	liveness +
  	valence +
  	tempo +
  	duration +
  	C + `C#|Db` + D + `D#|Eb` +
  	E + `F` + `F#|Gb` + G +
  	`G#|Ab` + A + `A#|Bb` + B +
  	c01 + c02 + c03 + c04 + c05 + c06 +
  	c07 + c08 + c09 + c10 + c11 + c12,
	data = out2010
  ) |>
  step_center(all_predictors()) |>
  step_scale(all_predictors()) |>
  prep(out2010 |> mutate(track.name = str_trunc(track.name, 20))) |>
  juice() |>
  column_to_rownames("track.name")

out2010_dist <- dist(out2010_juice, method = "euclidean")
out2010_dist |>
  hclust(method = "complete") |>
  dendro_data() |>
  ggdendrogram()
```

### Mirrors by Justin Timberlake
```{r}
#mirrors <- get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV")
#
#mirrors |>
#  tempogram(window_size = 8, hop_size = 1, cyclic = TRUE) |>
#  ggplot(aes(x = time, y = bpm, fill = power)) +
#  geom_raster() +
#  scale_fill_viridis_c(guide = "none") +
#  labs(x = "Time (s)", y = "Tempo (BPM)") +
#  theme_classic()
```

### Stan by Eminem
```{r}
#stan <- get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz")
#
#stan |>
#  tempogram(window_size = 8, hop_size = 1, cyclic = FALSE) |>
#  ggplot(aes(x = time, y = bpm, fill = power)) +
#  geom_raster() +
#  scale_fill_viridis_c(guide = "none") +
#  labs(x = "Time (s)", y = "Tempo (BPM)") +
#  theme_classic()
```

Column {data-width= 400}
----------------------

### Text about the dendrogram and tempograms
Here we have the hierarchical cluster of the 2010 corpus. I used the "complete" method in the code otherwise the dendrogram was much more inefficient (it isn't best now too but I can not seem to fix that..).

I also have a problem with the tempograms of the songs, because if I load the code separately the code works fine, but when I knit the project it keeps in a loop and eventually stops knitting... I really don't know how to fix that. I do have the code in the comments in de .Rmd file.


Conclusion and Thoughts
==================================
What we can conclude from all the plots and the dashboard is that while there are differences between the two playlists and years, there is too little time difference to see these differences properly and say certain conclusions. However, there are clear differences between the two tracks I used for this dashboard.

Perhaps for future work, clear differences can also be seen between the shortest two songs of the two playlists. In addition, it might also be useful to take the "all out of 80s" playlist instead of the 2000s playlist. There are many more years between the 80s and the 2010s, and thus the differences will probably be easier to see. This is an interesting possibility for a follow-up study.